Data Mining Techniques

Authors

  • Mohammed J. Zaki
  • Limsoon Wong
Abstract

Data mining is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. Traditional data analysis is assumption driven in the sense that a hypothesis is formed and validated against the data. Data mining, in contrast, is data driven in the sense that patterns are automatically extracted from data. The goal of this tutorial is to provide an introduction to data mining techniques. The focus will be on methods appropriate for mining massive datasets using techniques from scalable and high performance computing. The techniques covered include association rules, sequence mining, decision tree classification, and clustering. Some aspects of preprocessing and postprocessing are also covered. The problem of predicting contact maps for protein sequences is used as a detailed case study. The material presented here is compiled by LW based on the original tutorial slides of MJZ at the 2002 Post-Genome Knowledge Discovery Programme in Singapore.

Similar resources

Credit scoring in banks and financial institutions via data mining techniques: A literature review

This paper presents a comprehensive review of the work done during 2000–2012 on the application of data mining techniques in credit scoring. Yet there is no literature in the field of data mining applications in credit scoring. Using a novel research approach, this paper investigates academic and systematic literature review and includes all of the journals in the ScienceDirect onli...

Prediction of Student Learning Styles using Data Mining Techniques

This paper focuses on the prediction of student learning styles using data mining techniques within their institutions. This prediction was aimed at finding out how different learning styles are achieved within learning environments which are specifically influenced by already existing factors. These learning styles have been affected by different factors that are mainly engraved and found wit...

Using data mining techniques for predicting the survival rate of breast cancer patients: a review article

This review was conducted between December 2018 and March 2019 at Isfahan University of Medical Sciences. A review of various studies revealed what data mining techniques to predict the probability of survival, what risk factors for these predictions, what criteria for evaluating data mining techniques, and finally what data sources for it have been used to predict the surv...

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a huge increase in the number of frauds arising from abnormal user behavior is resulting in billions of dollars in losses around the world. T...

A Geometric View of Similarity Measures in Data Mining

The main objective of data mining is to acquire information from a set of data for prospective applications using a measure. The concerning issue is that one often has to deal with large-scale data. Several dimensionality reduction techniques, such as various feature extraction methods, have been developed to resolve the issue. However, the geometric view of the applied measure, as an additional consid...

Publication date: 2003